As gender bias in natural language processing (NLP) becomes a significant interdisciplinary topic, prevalent data-driven techniques such as large-scale language models suffer from inadequate data and biased corpora, especially for languages with insufficient resources such as Chinese. To this end, we propose a Chinese cOrpus foR Gender bIas Probing and Mitigation (CORGI-PM), which contains 32.9k sentences with high-quality labels derived by following an annotation scheme specifically developed for gender bias in the Chinese context. Moreover, we address three challenges for automatic textual gender bias mitigation, which require models to detect, classify, and mitigate textual gender bias. We also conduct experiments with state-of-the-art language models to provide baselines. To the best of our knowledge, CORGI-PM is the first sentence-level Chinese corpus for gender bias probing and mitigation.
As an important variant of entity alignment (EA), multi-modal entity alignment (MMEA) aims to discover identical entities across different knowledge graphs (KGs) with multiple modalities such as images. However, current MMEA algorithms all adopt KG-level modality fusion strategies and ignore modality differences among individual entities, which hurts robustness to potential noise in the modalities (e.g., unidentifiable images and relations). In this paper we present MEAformer, a multi-modal entity alignment transformer approach for meta modality hybrid, which dynamically predicts the mutual correlation coefficients among modalities for instance-level feature fusion. A modal-aware hard entity replay strategy is also proposed for addressing vague entity details. Extensive experimental results show that our model not only achieves SOTA performance in multiple training scenarios, including supervised, unsupervised, iterative, and low-resource settings, but also has a limited number of parameters, favorable speed, and good interpretability. Our code will be available soon.
Tensor robust principal component analysis (TRPCA) is a promising approach to low-rank tensor recovery, which minimizes the convex surrogate of tensor rank by shrinking every tensor singular value equally. However, for real-world visual data, large singular values represent more significant information than small singular values. In this paper, we propose a nonconvex TRPCA (N-TRPCA) model based on the tensor adjustable logarithmic norm. Unlike TRPCA, our N-TRPCA can adaptively shrink small singular values more and large singular values less. In addition, TRPCA assumes that the whole data tensor is of low rank. This assumption is rarely satisfied in practice for natural visual data, which restricts the capability of TRPCA to recover edges and texture details from noisy images and videos. To this end, we integrate nonlocal self-similarity into N-TRPCA and further develop a nonconvex and nonlocal TRPCA (NN-TRPCA) model. Specifically, similar nonlocal patches are grouped as a tensor, and each group tensor is then recovered by our N-TRPCA. Since the patches in one group are highly correlated, all group tensors have a strong low-rank property, which improves recovery performance. Experimental results demonstrate that the proposed NN-TRPCA outperforms some existing TRPCA methods in visual data recovery. The demo code is available at https://github.com/qguo2010/NN-TRPCA.
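The contrast between equal and adaptive shrinkage can be sketched in a few lines of Python. This is a minimal illustration on a plain list of singular values, not the authors' N-TRPCA solver: the function names, the threshold `tau`, and the smoothing constant `eps` are assumptions, and the weight `1 / (s + eps)` is the generic reweighting induced by a log penalty rather than the paper's tensor adjustable logarithmic norm.

```python
def soft_threshold(singular_values, tau):
    """Nuclear-norm style shrinkage: subtract the same tau from every value."""
    return [max(s - tau, 0.0) for s in singular_values]


def log_weighted_threshold(singular_values, tau, eps=1.0):
    """One reweighted shrinkage step for a log penalty (illustrative assumption).

    The derivative of log(s + eps) is 1 / (s + eps), so each value is reduced
    by tau / (s + eps): large values lose little, small values lose more.
    """
    return [max(s - tau / (s + eps), 0.0) for s in singular_values]


if __name__ == "__main__":
    values = [10.0, 4.0, 0.5]
    print("equal shrinkage:   ", soft_threshold(values, tau=1.0))
    print("adaptive shrinkage:", log_weighted_threshold(values, tau=1.0))
```

With these hypothetical inputs, the equal rule removes a full `tau` from the largest value, while the weighted rule removes only `tau / 11` from it, which is the qualitative behavior the abstract describes.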
In knowledge graph completion (KGC), predicting triples involving emerging entities and/or relations, which are unseen when the KG embeddings are learned, has become a critical challenge. Subgraph reasoning with message passing is a promising and popular solution. Some recent methods have achieved good performance, but they (i) usually can only predict triples involving unseen entities alone, failing to address more realistic fully inductive situations with both unseen entities and unseen relations, and (ii) often conduct message passing over the entities without fully utilizing the relation patterns. In this study, we propose a new method named RMPI, which uses a novel Relational Message Passing network for fully Inductive KGC. It passes messages directly between relations to make full use of the relation patterns for subgraph reasoning, with new techniques for graph transformation, graph pruning, relation-aware neighborhood attention, addressing empty subgraphs, etc., and can utilize the relation semantics defined in the ontological schema of the KG. Extensive evaluation on multiple benchmarks shows the effectiveness of the techniques involved in RMPI and its better performance compared with existing methods that support fully inductive KGC. RMPI is also comparable to the state-of-the-art partially inductive KGC methods, with very promising results achieved. Our code and data are available at https://github.com/zjukg/RMPI.
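For years, the YOLO series has been the de facto industry-level standard for efficient object detection. The YOLO community has prospered overwhelmingly, enriching its use across a multitude of hardware platforms and abundant scenarios. In this technical report, we strive to push its limits to the next level, stepping forward with an unwavering mindset for industry application. Considering the diverse requirements for speed and accuracy in real environments, we extensively examine the latest object detection advancements from both industry and academia. Specifically, we heavily assimilate ideas from recent network design, training strategies, testing techniques, quantization, and optimization methods. On top of this, we integrate our own thoughts and practice to build a suite of deployment-ready networks at various scales to accommodate diversified use cases. With the generous permission of the YOLO authors, we name it YOLOv6. We also warmly welcome users and contributors for further enhancement. For a glimpse of performance, our YOLOv6-N hits 35.9% AP on the COCO dataset at a throughput of 1234 FPS on an NVIDIA Tesla T4 GPU. YOLOv6-S strikes 43.5% AP at 495 FPS, outperforming other mainstream detectors at the same scale (YOLOv5-S, YOLOX-S, and PPYOLOE-S). Our quantized version of YOLOv6-S even brings a new 43.3% AP at 869 FPS. Furthermore, YOLOv6-M/L also achieves better accuracy (i.e., 49.5%/52.3%) than other detectors with similar inference speed. We carefully conducted experiments to validate the effectiveness of each component. Our code is available at https://github.com/meituan/yolov6.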
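Retrieval models based on dense representations in semantic space have become an indispensable branch of first-stage retrieval. These retrievers benefit from advances in representation learning toward compressive global sequence-level embeddings. However, they are prone to overlooking local salient phrases and entity mentions in texts, which usually play pivotal roles in first-stage retrieval. To mitigate this weakness, we propose to align a dense retriever with a well-performing lexicon-aware representation model. The alignment is achieved by weakened knowledge distillation that enlightens the retriever in two ways: 1) a lexicon-augmented contrastive objective to challenge the dense encoder, and 2) a pair-wise rank-consistent regularization that makes the dense model's behavior incline toward that of the lexicon-aware model. We evaluate our model on three public benchmarks. The results show that, with a comparable lexicon-aware retriever as the teacher, our proposed dense retriever brings consistent and significant improvements and even outperforms its teacher. In addition, we find that our improvement on the dense retriever is complementary to standard ranker distillation, which can further lift state-of-the-art performance.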
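Deep learning recommendation models (DLRMs) have been widely applied in Internet companies. The embedding tables of DLRMs are too large to fit entirely in GPU memory. We propose a GPU-based software cache approach that dynamically manages the embedding table across the CPU and GPU memory spaces by leveraging the ID frequency statistics of the target dataset. Our proposed software cache efficiently trains entire DLRMs on the GPU in a synchronized update manner. It is also scaled to multiple GPUs in combination with the widely used hybrid parallel training approaches. Evaluation of our prototype system shows that we can keep only 1.5% of the embedding parameters in the GPU and still obtain a decent end-to-end training speed.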
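The idea of a frequency-guided software cache for embedding rows can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' system: the class name `FrequencyCache`, its fields, and the eviction rule (drop the cached row with the lowest dataset frequency) are hypothetical, and both memory tiers are simulated with plain Python dictionaries rather than real CPU and GPU buffers.

```python
from collections import Counter


class FrequencyCache:
    """Keeps the most frequent embedding rows in a small fast tier.

    Both tiers are plain dicts here; a real system would hold the full
    table in CPU memory and the cached rows in GPU memory.
    """

    def __init__(self, cpu_table, id_counts, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.cpu_table = cpu_table        # id -> embedding row (full table)
        self.capacity = capacity          # rows the fast tier can hold
        self.counts = Counter(id_counts)  # dataset-level ID frequencies
        # Warm the fast tier with the most frequent IDs.
        hot_ids = [i for i, _ in self.counts.most_common(capacity)]
        self.gpu_rows = {i: cpu_table[i] for i in hot_ids}
        self.hits = 0
        self.misses = 0

    def lookup(self, row_id):
        if row_id in self.gpu_rows:
            self.hits += 1
            return self.gpu_rows[row_id]
        self.misses += 1
        if len(self.gpu_rows) >= self.capacity:
            # Evict the least frequent cached row and write it back so that
            # updates made on the fast tier are not lost.
            victim = min(self.gpu_rows, key=lambda i: self.counts[i])
            self.cpu_table[victim] = self.gpu_rows.pop(victim)
        self.gpu_rows[row_id] = self.cpu_table[row_id]
        return self.gpu_rows[row_id]


if __name__ == "__main__":
    table = {i: [float(i)] * 4 for i in range(100)}
    cache = FrequencyCache(table, {0: 50, 1: 30, 2: 5, 3: 1}, capacity=2)
    for row_id in [0, 1, 0, 2, 0]:
        cache.lookup(row_id)
    print("hits:", cache.hits, "misses:", cache.misses)
```

Because ID frequencies in recommendation data are typically highly skewed, a small fast tier warmed with the most frequent IDs can serve most lookups, which is the intuition behind keeping only a small fraction of the parameters on the GPU.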
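Video recognition has been dominated by the end-to-end learning paradigm: first initializing a video recognition model with the weights of a pretrained image model, and then conducting end-to-end training on videos. This enables the video network to benefit from the pretrained image model. However, it requires substantial computation and memory resources for finetuning on videos, and the alternative of directly using pretrained image features without finetuning the image backbone leads to subpar results. Fortunately, recent advances in Contrastive Vision-Language Pre-training (CLIP) pave the way for a new route for visual recognition tasks. Pretrained on large open-vocabulary image-text pair data, these models learn powerful visual representations with rich semantics. In this paper, we present Efficient Video Learning (EVL), an efficient framework for directly training high-quality video recognition models with frozen CLIP features. Specifically, we employ a lightweight Transformer decoder and learn a query token to dynamically collect frame-level spatial features from the CLIP image encoder. Furthermore, we adopt a local temporal module in each decoder layer to discover temporal clues from adjacent frames and their attention maps. We show that, despite being efficient to train with a frozen backbone, our models learn high-quality video representations on a variety of video recognition datasets. Code is available at https://github.com/opengvlab/feld-video-rencognition.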
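A good understanding of terrain information is essential for improving the performance of various downstream tasks on complex terrain, especially the locomotion and navigation of legged robots. We present a novel framework for neural urban terrain reconstruction with uncertainty estimation. It generates dense robot-centric elevation maps online from sparse LiDAR observations. We design a novel pre-processing and point feature representation approach that ensures high robustness and computational efficiency when integrating multiple point cloud frames. A Bayesian-GAN model then recovers the detailed terrain structures while simultaneously providing pixel-wise reconstruction uncertainty. We evaluate the proposed pipeline through extensive simulation and real-world experiments. It demonstrates efficient terrain reconstruction with high quality and real-time performance on a mobile platform, which further benefits the downstream tasks of legged robots. (See https://kin-zhang.github.io/ndem/ for more details.)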
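There has been increased interest in applying machine learning techniques to relational structured data based on an observed graph. Often, this graph is not fully representative of the true relationships among nodes. In these settings, building a generative model conditioned on the observed graph makes it possible to take graph uncertainty into account. Existing techniques either rely on restrictive assumptions, fail to preserve topological properties in the samples, or are prohibitively expensive for larger graphs. In this work, we introduce the node copying model for constructing a distribution over graphs. A random graph is sampled by replacing each node's neighbors with those of a randomly sampled similar node. The sampled graphs preserve key characteristics of the graph structure without explicitly targeting them. Additionally, sampling from this model is extremely simple and scales linearly with the number of nodes. We show the usefulness of the copying model in three tasks. First, in node classification, a Bayesian formulation based on node copying achieves higher accuracy in sparse data settings. Second, we employ the proposed model to mitigate the effect of adversarial attacks on the graph topology. Finally, incorporating the model into a recommendation system setting improves recall over state-of-the-art methods.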
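The sampling step of a node copying model can be sketched in Python as below. This is a minimal sketch under stated assumptions rather than the authors' implementation: the function name, the `copy_prob` parameter, and the use of a shared group label as the notion of "similar node" are all hypothetical, and the sampled graph is returned as a directed neighbor map.

```python
import random


def sample_node_copied_graph(adjacency, labels, rng, copy_prob=1.0):
    """Draw one graph by copying neighbor lists between similar nodes.

    adjacency: dict node -> list of neighbors (the observed graph)
    labels:    dict node -> group id; nodes in the same group are treated
               as "similar" (an assumption made for this sketch)
    rng:       a random.Random instance, passed in for reproducibility
    """
    # Group nodes by label so a donor can be drawn in constant time.
    groups = {}
    for node, label in labels.items():
        groups.setdefault(label, []).append(node)

    sampled = {}
    for node in adjacency:  # a single pass over the nodes
        if rng.random() < copy_prob:
            donor = rng.choice(groups[labels[node]])
        else:
            donor = node  # keep the node's own neighbors
        sampled[node] = list(adjacency[donor])
    return sampled


if __name__ == "__main__":
    observed = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3]}
    groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b"}
    print(sample_node_copied_graph(observed, groups, random.Random(0)))
```

Each sampled neighbor list is an exact copy of some observed neighbor list, so local structure such as node degree is inherited from the observed graph without being modeled explicitly.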